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We probe the numerical errors made in renormalization group calculations 

■ by varying slightly the rescaling factor of the fields and rescaling back in order 
to get the same (if there were no round-off errors) zero momentum 2-point 
function (magnetic susceptibility). The actual calculations were performed 
with Dyson's hierarchical model and a simplified version of it. We compare 
the distributions of numerical values obtained from a large sample of rescal- 

• ing factors with the (Gaussian by design) distribution of a random number 

■ generator and find significant departures from the Gaussian behavior. In ad- 
. dition, the average value differ (robustly) from the exact answer by a quantity 
I which is of the same order as the standard deviation. We provide a simple 

"j^ I model in which the errors made at shorter distance have a larger weight than 

those made at larger distance. This model explains in part the non-Gaussian 
features and why the central-limit theorem does not apply. 
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I. INTRODUCTION 



In the standard model of electro-weak and strong interactions, all the masses are gener- 
ated by couplings to a scalar Higgs particle. However, one might be inclined to think that, 
ultimately, gravitational interactions should be responsible for mass generation. If there is 
no new physics between the electro-weak scale and the Planck scale and if we know nothing 
about the Planck scale interactions, the best we can do is to use an effective scalar La- 
grangian with a cut-off of the order of the Planck scale. In this approach, one is confronted 
with the hierarchy problem |[^, which means that in order to keep a physical mass very 
small in cut-off units, one needs to fine-tune some parameter of the theory (usually the bare 
mass) with an incredible precision. In four dimensions, this fine-tuning can be understood 
in terms of perturbative quadratic divergences. 

In the renormalization group (RG) approach of field theory 0, the need for fine-tuning 
can be related to the existence of unstable directions in a way which is independent of 
perturbation theory in a particular dimension. In this approach, the renormalized mass 
expressed in cut-off units decreases exponentially with the number of iterations spent near 
an unstable fixed point. In order to keep the mass small, one needs to fine-tune the initial 
parameter in order to start close to the critical hypersurface. This hypersurface in the space 
of bare Lagrangians separates the symmetric phase from the symmetry broken phase and 
contains the unstable fixed point. 

In the following, we use the statistical mechanics language where criticality is approached 
by tuning the inverse temperature j3 close to its critical value (3c- In terms of this parameter, 
the ratio of the physical mass m and the ultra-violet (UV) cutoff A is 



where 7 is the critical exponent associated with the magnetic susceptibility. In four dimen- 
sions, 7 = 1. If we take m = 100 GeV, a typical electroweak scale, and A = lO^^GeV of the 
order of the Planck mass, we need to fine-tune (3 with 34 digits. 

The main virtue of the RG approach is to introduce some hierarchy in the information 
contained in the partition function. At each iteration, the information relevant to under- 
stand the large distance behavior is amplified, while the rest of the information is discarded 
according to its degree of irrelevance. However if some "noise" is introduced in this process, 
for instance as numerical errors in the calculation, the error in the relevant direction will be 
amplified too. This may lead to situations where the amplified errors wipe out the careful 
fine-tuning and one obtains meaningless results. 

In this article, we discuss the numerical errors with examples where the RG transforma- 
tion can be calculated for many iterations. The models used for our calculations are Dyson's 
hierarchical model [Q], for which very accurate methods of calculation |^,|^ have been re- 
cently designed, and a simplified version of this model for which the nonlinear aspects 
of the interpolation between fixed points is understood in great detail. For comparison, we 
also considered a random number generator designed to produce a Gaussian distribution. 

The question of numerical errors in RG calculation has been briefly discussed in recent 
publications. The mechanism of error amplification was identified in section IV of Ref. 

In these calculations, we probed the numerical errors by slightly varying the parameter 
(denoted s hereafter) used to rescale the fields at each iteration of the RG transformation 



m/A ~ (/3, - py/' , 
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and exploiting the fact that the physical quantities such as the magnetic susceptibilty x 
are independent of the choice of rescaling. This procedure is reviewed in section |T|. We 
predicted that if 6 represents a typical round-off error (of the order of 10~^^ in double 
precision calculations), the relative difference between the numerical values obtained for two 
slightly different values of s should be of the order 

^x/x-m-^y ■ (1-2) 

This behavior was observed in good approximation for a wide range of (3 (see Fig. 4 of Ref. 
^). A consequence of this result is that at fixed 5 and m, there is a maximum UV cutoff of 
the order of mb~^l'^ beyond which the result of a calculation becomes totally meaningless. 

In the models we have considered, the exact rescaling factor s for which the RG transfor- 
mation has a non-trivial fixed point (denoted s^xact hereafter) is exactly the same as for a free 
theory. In models with nearest neighbor interactions, this is only approximately the case. 
The free value is corrected whenever the critical exponent ?7 is non-zero. Since in most cases, 
?7 is only known with a finite accuracy, one is naturally led to consider a range of values for 
s. In section Ill.b of Ref. p we considered the distribution of values of the susceptibihty 
for 2000 values of s. To our surprise, we found that the errors did not average out to the 
correct value which was calculated using higher precision arithmetic. In addition, we found 
that the distribution of errors was not Gaussian. In the present paper, we readdress these 
questions with larger statistics and we compare the results with other models. The technical 
details of our analysis are outlined in section ||. 

Our first result is that the average error is of the same order of magnitude as the standard 
deviation of the distribution. There is a systematic bias in the calculations and one does 
not gain anything by increasing the statistics. This bias depends on the peculiarities of 
arithmetic round-offs and we have not attempted to model it from a "microscopic" point of 
view. It should however be noticed that the average error is "compiler robust" and that it 
should in principle be understandable at least for the simplified model for which only a few 
arithmetic operations are needed at each iteration of the RG transformation. 

Another robust result is that all our RG calculations show significant departures from 
Gaussian behavior. These departures can be estimated in terms of the skewness and kurtosis 
coefficients. In all the RG calculations these coefficients are at least one order of magnitude 
larger than the corresponding quantities for the Gaussian random number generator. 

How do the repeated errors fail to erase the details of the individual distribution and 
produce a Gaussian distribution as in the central limit theorem? We have considered two 
possible explanations. The first one is that the distribution of errors changes from step to 
step. A detailed study shows that there are indeed small variations in these distributions, 
however this might not be the most important effect. The main reason why the central 
limit does not apply seem to be that the errors are added with unequal weight. The errors 
made during the initial iterations are more amplified than errors made at later stage. As 
a consequence, the "weighted average" inherits the skewness and kurtosis of the individual 
distribution. Since these have "short tails", so does the distribution of errors, unlike non- 
Gaussian distributions found in the study of turbulence which have "long tails" . 

Our results impose limitations on numerical approaches of scalar field theory, however 
these are not drastic. The main result is that as far as the average of low powers of the total 
field is concerned, a hierarchy A/m requires a number of significant digits proportional to 
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Log(A/m), which is not prohibitive for A/m ~ 10 



II. TECHNICAL OUTLINE OF THE PAPER 

The renormahzation group (RG) approach of scalar field theory, the hierarchy problem 
and our strategy to probe the numerical errors are reviewed in generic terms in section |T|. 
The specific models used for our calculations are presented in section 

In section 0, we consider seven samples with 10^ data points, six from RG calculations 
and one designed to produce a Gaussian distribution. It should be noted that in all cases 
randomness is due to sensitive dependence on the initial conditions. We show that the depar- 
ture from Gaussian behavior of the distributions of RG values are significantly larger than 
in the case of the random number generator designed to produce a Gaussian distribution. 
We also show that the errors spread differently when s < Sexact and s > Sexact, indicating 
that the sample should be "resolved" into subsamples with different distributions. 

Subsamples are analyzed quantitatively in section ^ in terms of "uniformity indicators" 
designed to establish correlations between the values of s and the moments of the distribu- 
tion. These indicators take values close to 1 when the sample is obtained from independent 
and identically distributed random variables. It is found that samples with only s < Sexact 
or only s > Sexact have much better uniformity than the the samples combining both sets 
of values. In the case of the simplified model, the samples with only s < Sexact or only 
s > Sexact havc valucs of the indicators which seem consistent with being samples coming 
from independent and identically distributed random variables. For the hierarchical model, 
the analogous samples need further resolution. 

It should be noted that in general one expects fluctuations of the order of (spread of the 
parent distribution) /((size of the sample) ^/^). Since we use large samples, these fluctuations 
are expected to be small. Compared to this small scale, the variations of the estimators for 
s < Sexact and s > Sexact are large. However, they are still relatively small when compared 
to the spread itself. Consequently, it still makes sense to talk in an approximate way about, 
for instance, the mean of the distribution. To take an example, one can look at Table I and 
see that the average errors for s < s exact (803) and s > Sexact (1297) differ by approximately 
one quart of the standard deviation of the any of the two samples. Consequently, we can 
still say that for any value of s, the error is approximately 1000. These remark should be 
kept in mind while we use expressions such as "the moments of a distribution" . 

A specific model where the errors made during the initial iterations are more amplified 
than errors made at later stage is provided in section [VI I| . In the conclusions, we discuss 
the limitations that our results impose on numerical approaches of scalar field theory. The 
situation is compared to what happens in chaotic dynamical systems and in the study of 
turbulence. 

HI. GENERAL STRATEGY TO PROBE THE NUMERICAL ERRORS 

Before getting into the specifics of model calculations, we would like to describe in 
general terms, our method to probe the numerical errors. The main points of this section 
are the following. First, the RG transformation involves a rescaling of the sum of the 
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fields in boxes of increasing sizes. In general, this rescaling factor is only known with a 
finite accuracy and so some range of values should in general be considered. Second, the 
physical quantities are independent of the choice of this rescaling factor. However in practical 
numerical calculations, the arithmetic errors are different for different choices of rescaling 
factor. Consequently, we can use large samples of rescaling factors to obtain a statistical 
distribution of these errors. We now proceed to discuss these points in the case of a generic 
scalar field theory. 

Let us consider a scalar model in D-dimensions with a lattice spacing oq. We first 
integrate the fields in blocks of side ba^ while keeping the sum of the fileds in the block 
constant. We then divide the sums of the fields by a factor and treat them as our 

new field variables. The procedure defines a discrete renormalization group transformation 
provided that the scale factor b is real number strictly larger than 1. 

The critical hypersurface (in the space of bare Lagrangians) is given as the stable manifold 
(e.g. the basin of attraction) of a non-trivial fixed point of this transformation. The stable 
manifold can be reached by considering a family of models indexed by a parameter which 
can be tuned in order to cross the stable manifold. In field theory context, one usually 
pick the bare mass to accomplish this purpose. In the statistical mechanics formulation, the 
inverse temperature (3 can be tuned to its critical value jSc which is a function of the other 
interactions. This notation will be used later. 

The information that we are keeping during the renormalization group transformation is 
encoded in the average values of all the integer powers of the sum of the fields in the blocks. 
We call these average values the "zero momentum Green's functions at finite volume" . This 
set of values can be thought of as an element of an infinite vector space. Near the fixed 
point, we can use the eigenvectors of the linearized transformation as a basis. As far as 
we are close to the fixed point, the average values of the powers of the rescaled total field 
stay approximately unchanged after one transformation. However at each iteration, the 
components in the eigendirections are multiplied by the corresponding eigenvalue. In the 
following, we discuss the case where there is only one relevant direction, in other words, only 
one eigenvalue larger than 1. We call this eigenvalue A. 

After repeating the renormalization group transformation n times, we have replaced 6^" 
sites by one site and associated a block variable with it. If at this point we neglect the 
interactions among the blocks of size 6^" and larger, we can calculate the finite volume 
volume Green's functions. For the sake of definiteness, we only discuss the case of the 
two point function. In the statistical mechanics language, the zero-momentum two point 
function is called the magnetic susceptibility. We define the finite volume susceptibility Xn 
as the average value of the square of the sum of all the (unrescaled) fields divided by the 
number of sites 6^". 

From the above discussion, we estimate 

Xn ^ b<''-'^\K, + ir2A"(/3. - 13)) (3.1) 

The power of b comes from rescaling back two powers of the fields to their original values, 
together with the division by the number of sites. The constant Ki is the constant value of 
the average of the square of the rescaled sum of the fields at the fixed point. The constant 
K2 depends on the way the critical hypersurface is approached when [3 is varied close to /?c. 
We want to emphasize that Eq. (|3.1| ) is valid only if the linearization procedure is 
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applicable, in other words if A"(/5c — P) << 1- On the other hand, when n reaches some 
critical value n* such that 

A"*(/5,-/5)^l, (3.2) 

non-linear effects become important and the sign of {jSc — (3) becomes important. In the 
following we will consider exclusively the case of the symmetric phase which is simpler. For 
a discussion in the case of the broken symmetry phase {(3 > jSc), the reader may consult 
Ref. 0. If P < Pc, the value of x starts stabilizing when n gets of the order of n*. As a 
consequence, 

^^^b-*i2-v) ^ (3.3) 

with 

7 = (2 - r])lnb/\nX . (3.4) 

When n gets larger, the high-temperature fixed point is reached rapidly. This fixed point is 
completely attractive. The irrelevant directions are manifested by volume effects decreasing 
like A model calculation of these effects can be found in Ref. 0]. 

For most models studied in the literature, the exact rescaling factor s = b^'^'^^~^^^'^ is 
not known with perfect accuracy. For instance in a Monte Carlo calculation for values 
of the couplings minimizing the subleading corrections, the values of r] obtained (for two- 
components in 3 dimensions) is ^ 0.0381 with errors of order 2 in the last quoted digit. 
Since at the end of the calculation we are rescaling back to the original field variables, this 
uncertainty would not affect the physical results provided that we were able to carry the 
integrations exactly. If the integrations are carried numerically, the arithmetic operations 
are performed differently and choosing a sample of values for the rescaling factor s close to 
the approximate values can be used to obtain a statistical sample of the numerical errors. 

We will thus consider values of the rescaling factor s of the form s exact + S, where s exact is 
the value for which there is an exact fixed point. In some sense 6 can be seen as the "seed" 
of a random number generator. We will now perform sample calculations for two models 
and compare our results with the results obtained from a set of seeds for a random number 
generator designed to produce approximate Gaussian distributions. 

IV. MODEL CALCULATIONS 

In this section, we describe in detail the three procedures used to obtain the data analyzed 
in the following sections. We discuss the hierarchical model [0] (subsection A), a simplified 
version of it (subsection B) and a random number generator which represents our naive 
expectations for the two other cases (subsection C). 

A. The hierarchical model 

The choice of the hierarchical model allows easy calculations with a controllable accu- 
racy. In order to avoid repetitions, we refer the reader to Ref. 0] for a more systematic 
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presentation. The main interest of this model is that only the local potential (or equiva- 
lently, the part of the measure which factorizes into a product of identical local functions 
which are called "the local measure" later) is affected by the RG transformation while the 
"kinetic" part is left invariant. 

The block-spin procedure can then be summarized by a simple integral formula (Eq. 
(2.2) in Ref. Q) for the local measure. Taking the Fourier transform and rescaling the sum 
of the fields in the block by an arbitrary rescaling factor 1/s one obtains the recursion 
relation 

Rn+iik) = a+iexp(-l/3(^.2)"+i^)(i?„(^))2 . (4.1) 

The parameter c is set equal to 2^~^/^ in order to approximate a D-dimensional model with 
nearest neighbor interactions. In the following the value D = 3 will be used. 

We fix the normalization constant Cn is such way that Rn{0) = 1. Rn{k) has then a 
direct probabilistic interpretation. If we call M„ the total field 4>x inside blocks of side 
2" and < ... >„ the average calculated without taking into account the interactions among 
different blocks of this size find 

We see that the Fourier transform of the local measure after n iterations generates the zero- 
momentum Green's functions calculated with 2" sites. In particular, we are interested in 
calculating the finite volume susceptibility 

,„ = l(^|l!2. = -2a„,4r. (4.3) 

As far as we are only interested in the calculation of < (Mn)'^'^ >„, the choice of s 
is a matter of convenience. For the calculations in the high temperature phase (symmetric 
phase) not too close to the critical points, or high temperature expansions the choice s = \f2 
is natural On the other hand, the the choice of rescaling factor s = 2c~^/^ makes the 
explicit dependence on n in Eq. ( [4.1|) disappear. For any other value of s, the map is 
somehow analogous to a differential equation with explicitly time-dependent coefficients. 
What we call "the RG transformation" in section |ITT| , is Eq. (|4.1| ) with s = 2c^^/^. It 



corresponds to the values b = 2^/^, r/ = and s = 6(2+^)/^ of the parameters defined in 
section ^TT[ For this value of s, Eq. (|4.1| ) has non-trivial fixed point [jl0| which seems to be 
unique p. 

We have calculated the susceptibility for the Ising measure {Ro{k) = cos{k)) with D = 3 
(i.e. c = 2^/^) and (3 = (3c — 10~^. The calculations have been performed using double 
precision Fortran. Using the approximate error given in Eq. ( |1.2| ), we see that 8 out of the 
16 significant digits of x should be correct for (3 = (3c — 10~^. Since we are in the high- 
temperature phase, Xn stops growing after approximately n* ~ 52 iterations (see section 



rT| and Ref. [g for details). The 16 digits of Xn are completely stabilized after n = 140 
iterations. We call this stable value x{s) where s refers to the value of s in Eq. ( [4.1| ) 
used for the numerical calculation. The calculations have performed using dimensional 
approximations of degree Imax'- 
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R^(k) = 1 + a„,iA;2 + an,2k^ + ... + an,;^.^^'™- , (4.4) 

for which the recursion formula for the an,m reads : 

^ Si— (Ep+,=^an.pan,,)[(20!/(/-m)!(2m)!](c/4)^[-(l/2)/3]'- 

Ei=r(Ep+,=.an,pan,,)[(2/)!//!](c/4)'[-(l/2)/5]' ^ ' ^ 

We found that the 16 digits of Xn were completely stabilized for l^ax = 50. 

We have used the set of values s = 2/ i/cim x 10~^ with m = 1, . . . 10^. We recorded the 
difference of xi^) defined above with respect to the accurate value x = 5.2316268857268 x 
10^°. This value was obtained by performing the calculation using higher precision arithmetic 
(namely 30 digits). We have checked that this accurate result is insensitive to changes in s. 
The results are discussed in the next sections. 



B. A simplified model 

As noticed in Ref. [Q, for [3 < [3c and n » n* Eq. (4J.) imples the approximate behavior 



X„+i^Xn + (/3/4)(c/2)"+ix' • (4.6) 
This map has been studied on its own in the rescaled form 

K+i = {c/2)K + (1 - c/2)hl . (4.7) 

This map can be seen as a drastically simplified version of the hierarchical model. One 
can check that if < /iq < 1, limjj_>oo(2/c)"/i„ is finite. This limit plays the role of the 
susceptibility in the following and can be calculated accurately by combining dual expansions 
i- 

Eq. ( |4.7| ) has an unstable fixed point aX h = 1 with eigenvalue A = 2 — c/2. We have 
required A = 1.427, approximately as for the hierarchical model with D = 3, in order to 
keep the value of n* the same. Consequently, the value of c used in Eq. (|4.71 ) is not the same 
as the value of c for the hierarchical model with D = 3. 

We have introduced the rescaling factor s through the redefinition 

a„ = {s^/AY^K , (4.8) 
in terms of which the map becomes 

a„+i = (2/s2)a„ + (1 - c/2){s^c/4:r-X (4-9) 



After calculating, a„ one can always return to hn using Eq. ( ^.8| ). If we had the chance to 
use exact arithmetic the expression would be independent of s. 

We have performed calculations for ho = = 1 — 10~^. We found that 150 iterations were 
sufficient to stabilize lim„_>oo(2/c)"'/i„. We have calculated this value for s = 2/ ^/c±mxlO~^ 
with m = 1, . . . 10^, as in the previous case. We have then subtracted the more accurate 
value 

lim„_>oo(2/c)"/i„ = 3.842965603774557 x 10^^ . (4.10) 
obtained and checked exactly with the same procedure as in the previous subsection. 
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C. A model with gaussian distribution 



In order to provide a comparison of the errors distributions of the two previous models, 
we have also generated 10^ numbers using a method designed to give a gaussian distribution. 
Since in the two previous cases we have approximately n* = 52 random processes before the 
value recorded starts stabilizing, we have added 52 random numbers. These random numbers 
have been produced by using repeated multiplication by a large number (7^) followed by a 
reduction modulo 1 (in other words, we drop the integer part). This procedure is inspired 



by results reviewed in Ref. [^. Iterating this procedure, we generate a sequence of random 
numbers which we expect to be uniformly distributed between and 1. In order to get 
numbers distributed between -1 and 1, we multiply each number by 2 and subtract 1. 
Finally, in order to get numbers approximately of the same order as the numbers of the 
other two sets, we have multiplied the final sum of 52 random numbers by 1000. The final 
results depends only on the initial number provided (the "seed"). We have repeated this 
calculation for 10^ values of the seed between and 1. 

The above discussion can be summarized as follows. We iterate 52 times the map 

a^^i = 7^an Modulo 1, (4.11) 

and then calculate 

52 

= 1000 X ^(2a„ - 1) . (4.12) 

n=l 

The sub- index j corresponds to different values of the seed ao- The data analyzed in the 
next sections corresponds the choice 

ao(j) =j7100005.23 , (4.13) 

with J = 1, . . . 10^. The choice of the denominator is motivated by the fact that we want 
a spacing between successive initial values slightly smaller than 10~^. The decimal points 
(.23) have been adjusted empirically in order to get a good uniformity of in subsamples (this 
question is discussed in section |V1B|). 



The map of Eq. ( |4.11| ) is designed to provide a sample from a variable a which we 
expect to be uniformly distributed between and 1. If this is the case, we should have 
((2a -1)) = and Var(2a-1) = 1/3. We use the common notation Var (A) = {{A-{A)f). 
If we now define, X = 1000 x I].^Li(2an — 1), we expect 

(X) = ; (4.14) 
Var(X) = (1000)2 x (52/3) - (4163)^ . 

These predictions will be tested in section [V^. 



V. ERROR DISTRIBUTIONS 

In order to get a first idea about what can be done to characterize the distributions of 
values obtained by the procedures described in the previous section, we first consider the 
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hierarchical model and display a subsample of 1000 data points for s below Sexact = 2/^0 
and 1000 data points for s above this value. The selection of the subsample was done by 
taking one out of every 100 values out of the original sample. More precisely, the integer 
m used in the parametrization of s given in subsection [IV A| , takes only values which are 
multiples of 100. The distribution of values is shown in Fig. |I|. 

One immediately realizes that the distribution is not symmetric about s = 2/-\/c. If 
s > 2/ y/c, the values of x are more spread than if s < 2/ y/c. An histogram can be obtained 
by dividing the data points into "horizontal bins" of equal vertical height and counting the 
number of data points in each bin. This procedure has been followed for 50,000 data points 
with s > and 50,000 data points with s < Ij Each of the two sets is obtained by 

taking every other point out of the data for the hierarchical model described in the previous 
section. With this procedure the total number of data points is still 10^ and the statistical 
properties can be easily compared with the other samples of 10^ data points. The histogram 
is displayed in Fig. ^ 

The solid line is obtained by plugging the estimated mean and standard deviation (dis- 
cussed in the next section) in a gaussian probability distribution. In the rest of this section, 
the terminology "Gaussian fit" refers to this procedure. One sees that there are large de- 
viations from the Gaussian fit. Given the size of the sample it is very unlikely that the 
deviations can be interpreted as statistical fluctuations. 

For comparison, we give in Fig. ^ an histogram of the distribution of 10^ data points 
obtained from the random number generator discussed in section [IV C|. The features of the 



distribution can be seen better by plotting the logarithm of the number of data points in 
each bin. In this semi- log plot, the fit corresponding to a Gaussian distribution is simply a 
parabola. Also, we will study separately, the data points with s < 2/a/c and s > 2/y/c since 
from Fig. ^ they have manifestly different standard deviations. The results for the two sets 



of 10 data points described in subsection IV A for the hierarchical model are shown in Fig 



These distributions are roughly of rectangular shape with slow modulations on the "fiat" 
part. They can be compared with a similar graph for the random number generator (Fig. 

On this graph, the departure from the parabolic behavior is barely visible except in the 
tails. Obviously, bins with zero data points cannot be shown in such a graph. 

Finally, we have displayed the results for the two sets of 10^ data points described in 
subsection |VB] for the toy model (Fig. ^). One sees that significant deviations from the 



Gaussian fit appear in the tails. The fact that the probability distribution falls more rapidly 
in this region is indicative of a probability distribution of the form exp(— ax^ — bx"^) with 
a and b positive. This relative simplicity suggests that we have a better chance to fully 
understand this simplified example. 

VI. MOMENT ANALYSIS 

A. Moment estimators 

A probability distribution is characterized by its moments. We would like to estimate 
the first moments of the distributions described in the previous sections assuming that they 
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are sample of a unique probability distribution. Let Xj with i, ... X be a set of independent 
and identically distributed (i.i.d.) random variables with moments 

{Xi) = /i ; (6.1) 
((X,-/i)0=/i. ,r>2, 

identical for any i. In particular, /i2 = o"^ is the variance or the square of the standard 
deviation a. We define the estimators 

N 



/i = (l/iV)^X, ; 
1 

N 

frr = {l/N)J2iX,-nr ,r>2. (6.2) 



Using the hypothesis that the random variables are independent, we can factor the powers 
of a given Xj and calculate its average independently. For instance, ii i ^ j, (X^Xj) = 
{Xi){Xj). Using the hypothesis that the random variables are identically distributed, we 
can use Eq. (6^) to express these expectations in terms of the common values of their 



moments. Using these rules, one obtains 

(fl) = fi] (6.3) 
(jTr) = fir + 0{l/N) ,r>2 . (6.4) 

The biases of the fir are of order 1/N. It is not difficult to remove the bias, however since 
we will work with large samples, the corrections are very small. 

In Table I, we give the estimated values of fi and a for the distributions discussed before. 
The abbreviation a. is short for "above" which means the 10^ points with s > for 
the hierarchical model (H.M.) ot the toy model (T.M.). Similarly b. is short for "below" 
(s < 2/y^), while a.+b. means "above and below" which is short for 10^ points obtained 
by taking every other value in the "above" and "below" set as explained at the beginning 
of section |V|. Finally, "R.N.G." is short for the random number generator described in 
subsection P^V Q These notations will be used again in the following. We emphasize that 
a.+b. is not the union of a. and b. (so fia.+b. 7^ (l/2)(/^a. + yUfe.)) and that all the samples 
have 10^ data points. 

Initially we expected that yU would be of order o/\/N. However, this is only the case 
for the random number generator. In all the other cases, /i, which we remind is the average 
difference with respect to the accurate value, is of the same order of magnitude as cr. In 
other words, it seems impossible to get rid of the errors by using large statistics! 

In order to check the compiler-dependence of our results, we have repeated the calcu- 
lations of the toy model in C, Mathematica and a different version of Fortran. We found 
that yU was only affected by less than 10 percent. On the other hand, a varied much more 
significantly. 

Note also that in the case of the R.N.G., the estimated value of a is close to the result 
(4163) obtained in Eq. ( |4.15| ) by assuming that the ai are uniform over the interval [0, 1]. 

As expected, the a "above" and "below" are significantly different. It is very unlikely 
that the data "above and below" is a sample from a unique distribution. More generally, 
one could study subsamples and check that the fluctuations are compatible with their size. 
This is the topic of the next subsection. 
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B. Fluctuations and uniformity 



In the previous subsection, we have used moments estimators which are in good approxi- 
mations unbiased. However, it is not clear that the sample comes from a unique distribution. 
One way to tackle this question is to consider the estimators in many subsamples and de- 
cide if the the fluctuations of the estimated values of the moments in the subsamples are 
compatible with the variance of the estimator which we now proceed to estimate. 

Using again the hypothesis of independent and identical distributions, we find in leading 
order in 1/N (in other words, up to 0(l/iV^) corrections) that: 

Var{Ji) = (l/iV)(/X2) ; (6.5) 

yar(/25) = (l/iV)(^4-/i^) ; (6.6) 

Var{-prz) = (l/iV)(/i6 - /is + " 6/12/^4) ; (6.7) 

Var{iM) = (l/iV)(/i8 -1^1 + 16/i2/i3 - S/igyUg) . (6.8) 



The proof of these results can be found for instance in Ref. |jT2[ . 

We considered the seven sets of 10^ data points discussed in the previous subsection. 
Each set has been divided into m subsets with Nb data points. Obviously, N = m x Nb- 
The partition has been done by putting together successive values of s. If one refers to Fig. 
|T|, the subsets are vertical partitions. We call /I^ (or jTs) the estimator of fir (or /i) in the 
S-th subsample. These are defined as in Eq. ( |6.2|) , except for the fact that needs to be 
replaced by Nb- In order to avoid confusion, we have used a subscript T (short for Total) to 
designate the estimators in the whole sample. We have studied the differences between the 
subsample estimates and the whole sample estimates. In order to get comparable answers, 
we have expressed these differences in "natural units" . These units should be such that if we 
had a sample from a unique distribution, all the fluctuations would be of order 1. From the 
expressions of the variances, ones sees that a^/y/Ns are natural units for the fluctuations 
of fi^s- In practice, we have to replace a by its estimated value. As an illustration, we have 
taken three sets previously abbreviated as H.M.a.-|-b., T.M.a.+b. and R.N.G. and divided 
them into m = 400 sub-bins. These sub-bins are ordered according to increasing values of s. 
For illustration, we have displayed the fluctuations of the second moment in natural units. 
The results are shown in Fig. |^. 

One sees that for the first two sets, there is a "jump" at s = 2/a/c that is significantly 
larger than the other fluctuations. Note that we have checked that both a.-l-b. sets had 
their sub-bins ordered in the same way. Besides these jumps, the fluctuations appear to be 
of much more "normal" size. In order to make this visual impression quantitative, we have 
defined the average of the square of the fluctuations in the subsamples: 

m 

Ui = {i/m)J2{J^s - ; 

5=1 
m 

Ur = (1/m) ^ (/!;> - fhTrf . (6.9) 
5=1 

If the sample comes from a single distribution, we expect these quantities to be equal to 
the variance of the corresponding estimator with fluctuations of order 1 / ^/m. We have thus 
defined some "indicators of uniformity" as 
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Pi = Ui/Ym:{Jis) ] 
Pr = ?7r/Var(/i;r5) . 



(6.10) 



If the sample comes from a single distributions, the Pi should be close to 1, while a large 
value indicates the presence of several distributions. The values of these indicators are given 
for the seven sets of data points previously described in Table II. 

First, one notices that the R.N.G. data has all the pi within less than ten percent of 1. 
On the other hand, both a.+b. sets have large values. For the toy model, the separation 
into above and below is sufficient to get rid of these large values. The values for T.M.a. 
and T.M.b are within 20 percent of 1. This seems in good approximation compatible with a 
single distribution. This quantitative analysis confirms the visual impression that one gets 
from Figs. ^ and On the other hand, it is possible to further "resolve" the a. and b. data 
for the hierarchical model. 

In the case of the hierarchical model, the local analysis of the sub-samples shows that 
the rapid variations of the mean has "large-scale" tendencies. Namely we observed regions 
of s where in average, the mean grows linearly when s increases followed by regions where 
it decreases. This behavior is presently under study. 

C. Departure from Gaussian behavior 

In section 0, we noticed that the distributions for the toy model a. and b. were more 
sharply peaked than a Gaussian distribution. This feature can be analyzed more quantita- 
tively by estimating the skewness coefficient 1^3/^2^'^ and the coefficient of kurtosis fi^/ jj,l — 3. 
For a Gaussian distribution, both coefficients are zero. These estimations of these coefficients 
for the data sets previously considered are given in Table III. 

For the T.M. a. and b., both coefficients are an order of magnitude larger than the 
corresponding coefficients for the R.N.G. . Larger deviations are observed for the other sets 
reflecting the lack of uniformity of these sets of data. 

In summary, the above table supports the general statement that the numerical errors 
appearing in RG calculations are non-Gaussian. We will now attempt to explain this general 
feature with a simple model. 

VII. A MODEL FOR NON-GAUSSIAN ERRORS 

One reason to be inclined to believe a-priori that errors distributions should be Gaussian 
is the famous central-limit theorem which asserts that the average of N i.i.d. random 
variables approaches a Gaussian distribution when N becomes inflnite. In this section, we 
give an heuristic derivation of this theorem and then present a mechanism by which we can 
avoid ending up with Gaussian distributions. 

A. The central limit theorem 

We flrst review the central-limit theorem in terms of a moment analysis. Let aj j = 
1, ... n be a set of i.i.d. random variables with arbitrary moments (/i, /i2 = o"^, yUs • • •)• 
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define 

n 

Z=[{l/n)Y^{a,-^^)]/{a/V^) . (7.1) 
i=i 

It is clear that Z has fi = and a = 1. The central-limit asserts that 

Umn->ooP{Z) = (l/v^)e-^'/2 (7 2) 

This theorem can be proven by showing that the moments of Z coincide with the moments 
of a normal Gaussian : 

/i2. = (2/-1)!! ; (7.3) 
/i2r+l = . (7.4) 

The calculation of the expected values of Z'^ goes as follows. Each term of Z'^ is a product 
of q terms of the form (a^ — fi) . The expected value can be factored into products involving 
the same i and then expressed in terms of the moments. Since (oj — /i) = 0, each i should 
be repeated at least twice. For Z^, the three i should be the same or one will be "left alone" 
(which would imply a vanishing average). Since the three indices have to be the same and 
since there are n indices we get a combinatoric factor n which is insufficient to overcome the 
ra"^/^ appearing in Z^. A similar reasoning in the case of Z^, shows there are three ways 
to arrange four indices into two pairs having the same index (assuming that the index of 
the two pairs are distinct). If we require that the four indexes are the same, it costs an 
additional 1/n suppression. In summary, when n becomes large 

< Z^ >= n^3/((TV/2)- > ; (7.5) 
< Z^ >= (n/i4 + 3n(n - l)(T^)/(aV)- > 3 . (7.6) 

The argument generalizes easily, and one realizes that the odd moments vanish and the even 
moment reduce to the number of unordered pairs, in agreement with Eq. ( [7.3[ ). 



B. Evading the central limit 

In order to see how the central-limit can be evaded, we have constructed a simplified 
model for the numerical errors. In the rest of this section, when we say "the quantity we 
are calculating numerically" this means the Rn{k) function defined in Eq. ( [4.1| ) for the 
hierarchical model or the quantity a„ of Eq. ( [4.9|) for the toy model. While we are near the 
fixed point, the numerical values appearing in these quantities which are relevant for our 
discussion are roughly of order 1. 

We first assume that the initial data {Rq or oq) has an error Sq in the unstable direction. 
Using the linearized model of section |T|, we see from Eq. (|3.1| ) that as far as we are near the 
fixed point, the quantity we are calculating numerically is of order Ki. After one iteration. 
So is amplified by a factor A In addition, an error of order i^ilO"^^ is made. We express our 
ignorance about the details of the round-off errors in terms of a random variable ai which 
takes positive or negative values of order 1 and write the new error as i^iailO"^^. Putting 
the two terms we get 
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= A^o + KiailQ-^^ . (7.7) 

Iterating n times, we get 

n 

^„ = iri(^A"-%,)10-i6 . (7.8) 

3=0 

In order to have compact notations, we have introduced a random variable such that 
Sq = A'laolO^^^. This term is not very important for the rest of the discussion, we might 
have set Sq = and considered errors for a given initial data expressed with a finite precision. 
In that case, the sum in Eq. { \l.8\) would run from 1 to n. 

The main difference between the expression of the error of Eq. ( [7. 8] ) and the variable 
Z introduced in the discussion of the central-limit theorem, is that the appear with 
different weights. One can then escape the "pair dominance" mechanism which puts all the 
individual errors on the same footing. Instead, the short- distance errors (small n) are more 
amplified than the large- distance errors (large n). Consequently, the total error is likely to 
inherit some of the features of the individual errors. This intuitive picture is confirmed by 
an explicit calculation that we now proceed to explain. 

We will calculate the skewness and kurtosis associated with 

Y = ± V-^a, , (7.9) 

3=0 

assuming that all the are independent and identically distributed. A detailed study of the 
errors associated with one iteration of the RG transformation for the toy model shows that 
this assumption is not totally correct but that it is a reasonable first-order approximation. 
A simple calculation shows that after neglecting terms of order 1 compared to A", we obtain 

<{Y- <Y >f > /(< {¥- < Y >)2 ~ [(A' - lfy{X^ - l)](/iVa3/') ; (7.10) 

[< {Y- <Y>)^> /(< {Y- < Y >)2 >)2] - 3 ~ [(A' - 1)V(A^ - iMi^^a^) - 3] . (7.11) 

One sees that Y inherits the non-Gaussian behavior of the a's, in the sense that the skewness 
and kurtosis coefficients of Y are proportional to those of a. 

For A = 1.427 and a uniformly distributed between -1 and 1 (corresponding to an error 
of ±1 in the last significant digit) and which is roughly what we observed for the toy model, 
the skewness and kurtosis coefficients of Y are and -0.41 respectively. These numbers are 
consistent with the values obtained for the toy model. A more detailed study shows that if 
we replace the average in Eq. ( [4. 121) for the random number generator by a weighted average 



as in Eq. (|7.8|) , one obtains histograms very similar to those of T.M. a. or b. (see Fig. 



VIII. CONCLUSIONS 

We have shown that in RG calculations, perturbations occurring at different scales fail 
to average each other out and that the final result is most affected by the peculiarities of the 
short- distance perturbations. In our study the perturbations are "anthropomorphic" in the 
sense that they are due to our calculation procedure rather then to natural phenomena. We 



15 



have also shown that it is not possible to get rid of the effects by averaging over calculations 
at slightly different values of the rescaling parameter s (which is not known exactly in most 
realistic situations). The only way the errors can be reduced to an acceptable level is by 
using enough significant digits in the arithmetic used in the calculations. 

The situations is somehow similar to what occurs in chaotic dynamical systems. There, 
the sensitive dependence on the initial conditions, implies that for time scales large com- 
pared to the inverse Lyapounov's exponents, one needs more significant digits than possibly 
achievable in order to keep track of a particular orbit. In the calculations performed here, 
the product of the inverse Lyapounov exponent (log(A)) and the time of calculation (n*) 
grows only like (2/7)log(A/m) and one can obtain reasonably precise calculations of the 
renormalized mass with (A/m) ~ 10^'^ by using only 40 significant digits (for 7 = 1). If we 
are interested in calculating the connected higher point functions, more digits are necessary 
because some digits are lost in the subtraction procedure as explained in Ref. @]. In conclu- 
sions, we see that if we are interested in the lowest point connected functions and if we have 
hierarchies of only 17 order of magnitudes, there is no serious limitation in our capability to 
calculate using the RG method. 

Finally, it should be noteds that non-Gaussian distributions are a common feature in the 
study turbulence |]T3[. A simple example is the distribution of transverse velocity increments 
between two points separated by a distance / in a turbulent jet, denoted 6v±{l). Typically, 
the distribution starts developing "long tails" when / becomes sufficiently small. In the case 
studied here, we have the opposite effect: our distribution have "short tails". This is due to 
the fact that the "microscopic" errors are the individual round-offs which have no tails at 
all. 
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FIG. 1. Distribution of x(s)— x^^"'^* for the hierarchical model for various values of the rescaling 
variable s. 
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FIG. 3. Partition of the 10^ values obtained using the random number generator into 200 bins 
as described in the text. The soHd Hne is a Gaussian fit. 
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Hierarchical Model : s<2/c 
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Hierarchical Model : s>2/c^ 
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FIG. 4. Partition of the 10^ values of x(s) for s < Ij^/c and 10^ values for s > into 200 

bins, for the hierarchical model. The logarithm of the number of data points in each bin is plotted 
versus the bin number. The solid parabola is a Gaussian fit. 
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FIG. 5. Partition of the 10^ values obtained from the random number generator described in 
the text into 200 bins. The logarithm of the number of data points in each bin is plotted versus 
the bin number. The solid parabola is a Gaussian fit. 
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Toy Model : s<2/c 
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Toy Model : s>2/c 



1/2 




50 100 150 200 

Bin Index j 



FIG. 6. Partition of the two sets of 10 values described in subsection IV B into 200 bins, for 



the toy model. The logarithm of the number of data points in each bin is plotted versus the bin 
number. The solid parabola is a Gaussian fit. 
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Random Number Generator 
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TABLES 

TABLE I. fi and a for the 7 sets of data considered 





H.M.(a.+b.) 


H.M.a. 


H.M.b. 


T.M.(a.+b.) 


T.M.a. 


T.M.b. 


R.N.G. 




1045 


1297 


803 


-3037 


-3023 


-3029 


4.22 


a 


2123 


2561 


1524 


3163.00 


2062 


3967 


4149 



TABLE II. Values of the pi for the seven sets of data considered. 





H.M.(a+b) 


H.M.a. 


H.M.b. 


T.M.(a+b) 


T.M.a. 


T.M.b. 


R.N.G. 


pl 


7.5 


2.6 


11 


1.02 


0.86 


0.85 


1.057 


p2 


41 


1.16 


2.0 


33 


1.18 


0.98 


0.987 


p3 


4.3 


1.04 


1.66 


0.875 


1.04 


0.94 


0.998 


p4 


26 


1.11 


1.72 


15 


1.20 


0.95 


1.018 



TABLE HI. Skewness and kurtosis coefficients 





H.M.(a+b) 


H.M.a. 


H.M.b. 


T.M.(a+b) 


T.M.a. 


T.M.b. 


R.N.G. 




0.31 


0.12 


0.15 


-0.0033 


-0.0090 


0.0038 


-0.00048 




-0.58 


-1.10 


-0.81 


0.56 


-0.29 


-0.34 


-0.015 
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